Combining Resources for Open Source Machine Translation
Authors
Abstract
In this paper, we present a Japanese→English machine translation system that combines rule-based and statistical translation. Our system is unique in that all of its components are freely available as open source software. We describe the development of the rule-based translation engine including transfer rule acquisition from an open bilingual dictionary. We also show how translations from both translation engines are combined through a simple ranking mechanism and compare their outputs.
Related resources
The Prague Bulletin of Mathematical Linguistics: Free/open-source Resources in the Apertium Platform for Machine Translation Research and Development
This paper describes the resources available in the Apertium platform, a free/open-source framework for creating rule-based machine translation systems. Resources within the platform take the form of finite-state morphologies for morphological analysis and generation, bilingual transfer lexica, probabilistic part-of-speech taggers and transfer rule files, all in standardised formats. These reso...
Jane: Open Source Machine Translation System Combination
Different machine translation engines can be remarkably dissimilar not only in their technical paradigm, but also in the translation output they yield. System combination is a method for combining the output of multiple machine translation engines to benefit from the strengths of each individual engine. In this work we introduce a novel system combi...
Free/Open-Source Resources in the Apertium Platform for Machine Translation Research and Development
This paper describes the resources available in the Apertium platform, a free/open-source framework for creating rule-based machine translation systems. Resources within the platform take the form of finite-state morphologies for morphological analysis and generation, bilingual transfer lexica, probabilistic part-of-speech taggers and transfer rule files, all in standardised formats. These resou...
Automatic Acquisition of Machine Translation Resources in the Abu-MaTran Project
This paper provides an overview of the research and development activities carried out to alleviate the language resources’ bottleneck in machine translation within the Abu-MaTran project. We have developed a range of tools for the acquisition of the main resources required by the two most popular approaches to machine translation, i.e. statistical (corpora) and rule-based models (dictionaries ...
Holy Moses! Leveraging Existing Tools and Resources for Entity Translation
Recently, there has been an emphasis on creating shared resources for natural language processing applications. This has resulted in the development of high-quality tools and data, which can then be leveraged by the research community as components for novel systems. In this paper, we reuse an open source machine translation framework to create an Arabic-to-English entity translation system. Th...